Process Substitution

In computing, process substitution is a form of inter-process communication that allows the input or output of a command to appear as a file. The command is substituted in-line, where a file name would normally occur, by the command shell. This allows programs that normally only accept files to directly read from or write to another program.


History

Process substitution was available as a compile-time option for ksh88, the 1988 version of the KornShell from Bell Labs. The rc shell provides the feature as "pipeline branching" in Version 10 Unix, released in 1990. The Bash shell provided process substitution no later than version 1.14, released in 1994 (the feature is present in the GNU source archive of version 1.14.7, as checked on 12 February 2016).


Example

The following examples use KornShell syntax. The Unix diff command normally accepts the names of two files to compare, or one file name and standard input. Process substitution allows one to compare the output of two programs directly:

    $ diff <(sort file1) <(sort file2)

The <(command) expression tells the command interpreter to run command and make its output appear as a file. The command can be any arbitrarily complex shell command.

Without process substitution, the alternatives are to save the output of each command to a temporary file and then read the temporary files, or to create a named pipe, start one command writing to it in the background, and run diff with the named pipe as input. Both alternatives are more cumbersome.

Process substitution can also be used to capture output that would normally go to a file, and redirect it to the input of a process. The Bash syntax for writing to a process is >(command). Here is an example using the tee, wc and gzip commands that counts the lines in a file with wc -l and compresses it with gzip in one pass:

    $ tee >(wc -l >&2) < bigfile | gzip > bigfile.gz
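
The diff comparison and its two more cumbersome alternatives (temporary files and a named pipe) can be sketched side by side. This is a minimal illustration assuming Bash; the sample data and the names workdir, sorted1.tmp, sorted2.tmp and sorted2.fifo are placeholders invented for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n'   > "$workdir/file1"
printf 'fig\ncherry\napple\n' > "$workdir/file2"

# diff exits with status 1 when its inputs differ; "|| true" keeps that
# from being treated as a failure.

# 1. Process substitution: no intermediate files to manage.
with_procsub=$(diff <(sort "$workdir/file1") <(sort "$workdir/file2")) || true

# 2. Temporary files: write both sorted outputs to disk, then compare.
sort "$workdir/file1" > "$workdir/sorted1.tmp"
sort "$workdir/file2" > "$workdir/sorted2.tmp"
with_tempfiles=$(diff "$workdir/sorted1.tmp" "$workdir/sorted2.tmp") || true
rm "$workdir/sorted1.tmp" "$workdir/sorted2.tmp"

# 3. Named pipe: one sort writes to the pipe in the background while
#    diff reads the other sort from standard input.
mkfifo "$workdir/sorted2.fifo"
sort "$workdir/file2" > "$workdir/sorted2.fifo" &
with_fifo=$(sort "$workdir/file1" | diff - "$workdir/sorted2.fifo") || true
wait
rm "$workdir/sorted2.fifo"

printf '%s\n' "$with_procsub"
rm -r "$workdir"
```

All three approaches should print the same differences; only the first needs no setup or cleanup steps.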


Advantages

The main advantages of process substitution over its alternatives are:
* Simplicity: The commands can be given in-line; there is no need to save temporary files or create named pipes first.
* Performance: Reading directly from another process is often faster than having to write a temporary file to disk, then read it back in. This also saves disk space.
* Parallelism: The substituted process can run concurrently with the command reading its output or writing its input, taking advantage of multiprocessing to reduce the total time for the computation.
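
The parallelism point can be observed with a rough timing sketch, assuming Bash. The function slow is a hypothetical stand-in for any long-running producer, and SECONDS is Bash's built-in whole-second timer, so the measurement is coarse.

```bash
#!/usr/bin/env bash
# Hypothetical slow producer: waits about 2 seconds, then prints its argument.
slow() { sleep 2; echo "$1"; }

SECONDS=0   # reset Bash's built-in timer (whole seconds)
# Both substituted commands are started before cat runs, so their
# 2-second waits overlap instead of adding up.
output=$(cat <(slow first) <(slow second))
elapsed=$SECONDS

printf '%s\n' "$output"
printf 'elapsed: about %s seconds\n' "$elapsed"
```

If the two producers ran one after the other, the total would be roughly 4 seconds; with process substitution it should be close to 2.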


Mechanism

Under the hood, process substitution has two implementations. On systems that support /dev/fd (most Unix-like systems), the shell calls the pipe() system call, which creates a new anonymous pipe and returns a file descriptor for each of its two ends. The shell then builds the string /dev/fd/$fd from the descriptor $fd that the main command will use, and substitutes that string on the command line. On systems without /dev/fd support, it calls mkfifo with a new temporary filename to create a named pipe, and substitutes this filename on the command line.

To illustrate the steps involved, consider the following simple process substitution on a system with /dev/fd support:

    $ diff file1 <(sort file2)

The steps the shell performs are:
1. Create a new anonymous pipe. This pipe will be accessible with something like /dev/fd/63; you can see it with a command like echo <(true).
2. Execute the substituted command in the background (sort file2 in this case), piping its output to the anonymous pipe.
3. Execute the primary command, replacing the substituted command with the path of the anonymous pipe. In this case, the full command might expand to something like diff file1 /dev/fd/63.
4. When execution is finished, close the anonymous pipe.

For named pipes, the execution differs solely in the creation and deletion of the pipe; they are created with mkfifo (which is given a new temporary file name) and removed with unlink. All other aspects remain the same.
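
The substitution step can be made visible by printing the arguments a command actually receives. This is a small sketch assuming Bash; show_args is a hypothetical helper written for the illustration, and the exact path it prints (for example /dev/fd/63, or a temporary named pipe) depends on the system.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each argument on its own line.
show_args() { printf '%s\n' "$@"; }

# The shell replaces <(true) with a path before show_args is run,
# so the helper never sees the literal text "<(true)".
args=$(show_args diff file1 <(true))
substituted=$(printf '%s\n' "$args" | tail -n 1)

printf 'arguments seen by the command:\n%s\n' "$args"
```

The last line printed is the path the shell substituted, which is what a real diff invocation would open as its second file.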


Limitations

The "files" created are not seekable, which means the process reading or writing to the file cannot perform random access; it must read or write once from start to finish. Programs that explicitly check the type of a file before opening it may refuse to work with process substitution, because the "file" resulting from process substitution is not a regular file. Additionally, until Bash 4.4 (released September 2016), it was not possible to obtain the exit code of a process substitution command from the shell that created the process substitution.
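
The first two limitations can be observed directly. The sketch below assumes Bash on a system with /dev/fd support; read_twice is a hypothetical helper, and on a system that falls back to named pipes the second read could block instead of returning immediately.

```bash
#!/usr/bin/env bash
# The substituted path is not a regular file, so the -f test fails.
if [ -f <(true) ]; then is_regular=yes; else is_regular=no; fi

# Hypothetical helper: read one line from the same path twice.
# With a regular file both reads would return the same first line;
# with a process substitution the data is gone after the first read.
read_twice() {
    read -r first  < "$1" || true
    read -r second < "$1" || true   # reaches end of input, leaves it empty
}
read_twice <(echo hello)

printf 'regular file: %s\n' "$is_regular"
printf 'first read: "%s"\nsecond read: "%s"\n' "$first" "$second"
```

A program that needs to rewind or re-read its input therefore has to be given a real file rather than a process substitution.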


See also

* Pipeline (Unix)
* Named pipe
* Command substitution
* Comparison of command shells
* Anonymous pipe



Further reading

* Frazier, Mitch (22 May 2008). "Bash Process Substitution". Linux Journal. http://www.linuxjournal.com/content/shell-process-redirection. Accessed 1 October 2011.